06. TD Control: Q-Learning

Please watch the video below to learn about Q-Learning (or Sarsamax), a second method for TD control.

TD Control: Sarsamax

Check out this (optional) research paper to read the proof that Q-Learning (or Sarsamax) converges.
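
As a rough illustration of what sets Q-Learning (Sarsamax) apart, the sketch below applies a single update step. Instead of using the action the agent actually takes next, the target uses the greedy (maximum) action value of the next state: Q(S, A) ← Q(S, A) + α (R + γ max_a Q(S', a) − Q(S, A)). The function name `q_learning_update`, the dictionary-of-lists Q-table, and the states `"s0"` and `"s1"` are assumptions made for this example and do not come from the lesson.

```python
from collections import defaultdict


def q_learning_update(Q, state, action, reward, next_state, n_actions,
                      alpha=0.1, gamma=1.0, done=False):
    """Apply one Sarsamax update to Q in place and return the new estimate."""
    # Greedy value of the next state; a terminal next state contributes 0.
    best_next = 0.0 if done else max(Q[next_state][a] for a in range(n_actions))
    # TD target: immediate reward plus the discounted greedy value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    Q[state][action] += alpha * (td_target - Q[state][action])
    return Q[state][action]


# Hypothetical two-action problem; unseen states start at zero.
Q = defaultdict(lambda: [0.0, 0.0])
Q["s1"] = [1.0, 3.0]

new_value = q_learning_update(Q, "s0", 0, reward=-1.0, next_state="s1",
                              n_actions=2, alpha=0.5, gamma=1.0)
print(new_value)  # 1.0, since 0 + 0.5 * ((-1 + 3) - 0) = 1.0
```

Because the target takes the maximum over next actions regardless of which action the behavior policy picks, the update estimates the greedy policy's values even while the agent explores.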

## Pseudocode